-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor new action insertion code so duplicate actions will not exist + provide means to fix duplicates #7112
Conversation
I remember a discussion about a UNIQUE index, the reason it wasn't implemented that way was because of the potential index size right? Regarding the patch I don't have a lot of feedback, I'm not familiar at all with the tracker. Is the additional query a problem? I guess it's searching against the whole table, so I wonder how it performs. |
Do you mean the query after an INSERT? There's an index on hash + type, so it should be fast. I'm not positive about the group by's but I think the index will be used there too, since it should group by type + hash then name. EDIT: Yeah, I was going to put a UNIQUE index, but there's no size limit on the action name, and unique indexes would require one. Since URLs can be large, a realistic limit would result in a very large index (I think). |
Do you mind writing tests for the
Maybe we can ask PRO to run such a query on a big instance with Also not sure re the Update. I would probably just remove it and if someone wants to run it, they can use the command. Should be announced in the release blog post and mentioned in FAQ |
@tsteur Added tests, still waiting for EXPLAINS results. Mind performing another review in the meantime? |
Looking for duplicated action ids this way looks like a pretty good solution to me 👍 It shouldn't cause much overhead as it is only done once on each action creation anyway. The Otherwise 👍 |
…ays using least idaction and deleting duplicates when they are found to be inserted. Since tracker process can potentially fail before duplicates are removed, added test to make sure reports work when duplicate actions exist in the DB.
…dd time elapsed to whole command.
…group bys in Tracker/Model.php and TableLogAction.
…ion fixing command.
9e835c5
to
62af9ad
Compare
Note: I removed some private methods, but I don't think the SQL is actually useful enough to move to another class (in a plugin or in core). The queries could only be used for detecting and fixing duplicate actions. |
It's not only whether it can be reused by core or plugins. It is also that the SQL statements just rather belong to a "model" and it makes them better testable. Also you never know whether it can be maybe reused in the future. If anyone has a use case he/she would probably not find it in that class but maybe in the model... Eg those methods could be maybe useful for other cases in the future one never knows: https://github.com/piwik/piwik/pull/7112/files#diff-23c6df91cba3becc700f91001c4e39bbR203 or https://github.com/piwik/piwik/pull/7112/files#diff-23c6df91cba3becc700f91001c4e39bbR172 . At least the SQL could be extracted in another class in that plugin. Doesn't have to be core Can you maybe use bind params here? https://github.com/piwik/piwik/pull/7112/files#diff-23c6df91cba3becc700f91001c4e39bbR179 and here https://github.com/piwik/piwik/pull/7112/files#diff-23c6df91cba3becc700f91001c4e39bbR268 or maybe convert Ids to integer? I know we only get numbers from the database but it will prevent any possible injections in the future in case someone wants to reuse it somewhere else or so and it doesn't hurt. |
…ed DuplicateActionRemover into a DAO that only deals w/ duplicate actions, created two new DAO classes (one for log_action manipulation and one for getting relational table metadata), deprecated SHOW COLUMNS method in Db in favor of new table metadata DAO, and added & fixed tests.
$result[$table] = $columns; | ||
} | ||
return $result; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the $tablesWithIdActionColumns
is hardcoded, couldn't it be the same for the columns? Or can they be dynamically added by plugins maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can be added via dimension classes (eg, ContentName). I don't know if there are any non-core ones that add idaction columns, but many core ones do so I guess it's conceivable that they exist. Also, I guess it's possible for some plugins to be disabled in which case the columns wouldn't exist.
if ($logger === null) { | ||
$logger = StaticContainer::get('Psr\Log\LoggerInterface'); | ||
} | ||
$this->logger = $logger; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you prefer those if statements could be eg:
$this->archiveInvalidator = $invalidator ?: new ArchiveInvalidator();
You could also get all those dependencies from the container so that you can use dependency injection without having null
in constructors, e.g.:
class DuplicateActionRemover
{
// ...
public function __construct(TableMetadata $tableMetadataAccess, LoggerInterface $logger)
instead of:
public function __construct($tableMetadataAccess = null, $logger = null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also get all those dependencies from the container so that you can use dependency injection without having null in constructors
Would I have to access StaticContainer directly? Maybe it's possible to do it once for the console command? That would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would definitely be nice!
The problem lies with how Symfony's Console work: we need to register command objects, so that means creating every single command just to init the console (see symfony/symfony#12063).
So if we start doing dependency injection in commands, that means every dependency will be created and injected, with dependencies of dependencies, etc... So 1 or 2 command isn't a big deal but we risk running into a case where a huge load of objects (maybe most of the application) are created by the container just to end up using a few of them…
PHP-DI offers to do lazy injection but:
- we can't use that in Piwik since it requires a lib that requires PHP > 5.3.20 or something
- it would mean marking most services as lazy, which doesn't make any sense
- this is not a clean solution anyway…
So in the meantime we are stuck with doing service location instead of dependency injection in commands…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way we could do something a tiny bit cleaner by introducing a get()
method on commands that would proxy that to the container, e.g.:
$logger = $this->get('Psr\Log\LoggerInterface');
(like in Symfony's controllers)
That's still service location, it's just a few characters shorter (and no static calls to StaticContainer
)… I don't have a strong opinion on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mnapoli Is it possible to inject dependencies in-place? Ie, after an object has been created? We could maybe create commands w/o setting dependencies and then before executing a command, we resolve the dependencies and call Command::execute. The dependencies would have to be specified via the @Inject
annotation I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's possible ($container->injectOn($obj)
) but we decided not to use annotations last time it was discussed.
Another option would be to keep commands as simple as possible and move everything into classes, though that's very far from the current reality in the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a proxy command class can be used in core\Console.php and the actual command instance wouldn't be created until the proxy's execute() method is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commands are instantiated because Symfony's Console needs to read the options/configuration. Even if we had a proxy commands would still be instantiated, unless we change the way commands are configured (i.e. we diverge from Symfony's official way) which would be fine by me (given this problem is pretty important)…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about it that sounds a good idea, I think I'll have a look at it (having DI only in controllers is really limiting…)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be awesome if it works :)
Looks good to me! ready to merge? |
Refactor new action insertion code so duplicate actions will not exist + provide means to fix duplicates
Well done @diosmosis It's a real relief that this bug will be fixed! 👍 |
This pull request modifies the tracking logic so duplicate actions will not exist in the log_action table and so even if duplicate actions exist in the log_action table, only one action ID will be used at any time.
Fixes #6436
Changes include:
@mnapoli @tsteur @mattab What are your thoughts?